Robot Navigation in Partially Observable Domains using Hierarchical Memory-Based Reinforcement Learning
Abstract
In this paper, we address robot navigation in a partially observable domain. The domain is a grid-world of intersecting corridors, in which the agent learns an optimal navigation policy using a hierarchical memory-based learning algorithm. We define a hierarchy of levels over which the agent abstracts both the learning process and its behaviour. The problem is modeled as a POMDP and solved with the SARSA algorithm, an on-policy Temporal Difference learning method. The agent uses short-term memory and abstracts over fine-grained details, which enables it to scale up to large partially observable domains.
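As a rough illustration of the two ingredients named in the abstract (SARSA updates and short-term memory), the following minimal sketch trains a flat tabular SARSA agent whose state is the tuple of its last few observations. It is not the paper's hierarchical algorithm: the toy task `CueCorridor`, the action names, the reward values, and all hyperparameters are assumptions made up for this example.

```python
import random
from collections import defaultdict, deque

ACTIONS = ("forward", "up", "down")  # hypothetical action set


class CueCorridor:
    """Hypothetical toy task: a cue visible only at the start cell tells the
    agent which way to turn at the junction at the end of a corridor."""

    def __init__(self, length=3, rng=None):
        self.length = length
        self.rng = rng or random.Random(0)

    def reset(self):
        self.pos = 0
        self.cue = self.rng.choice(("cue_up", "cue_down"))
        return self.observe()

    def observe(self):
        # Corridor cells all look alike, so the observation is ambiguous.
        if self.pos == 0:
            return self.cue
        return "junction" if self.pos == self.length else "corridor"

    def step(self, action):
        """Return (observation, reward, done)."""
        if self.pos < self.length or action == "forward":
            if self.pos < self.length and action == "forward":
                self.pos += 1
            return self.observe(), -0.1, False  # small step cost (assumed)
        correct = "up" if self.cue == "cue_up" else "down"
        return "end", (1.0 if action == correct else -1.0), True


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95, done=False):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    target = r if done else r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


def epsilon_greedy(q, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=2000, memory_len=4, alpha=0.1, gamma=0.95,
          epsilon=0.1, seed=0, max_steps=50):
    rng = random.Random(seed)
    env = CueCorridor(length=3, rng=rng)
    q = defaultdict(float)
    for _ in range(episodes):
        # Short-term memory: the agent's state is its last few observations.
        memory = deque([env.reset()], maxlen=memory_len)
        state = tuple(memory)
        action = epsilon_greedy(q, state, epsilon, rng)
        for _ in range(max_steps):
            obs, reward, done = env.step(action)
            memory.append(obs)
            next_state = tuple(memory)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            sarsa_update(q, state, action, reward, next_state, next_action,
                         alpha, gamma, done)
            state, action = next_state, next_action
            if done:
                break
    return q
```

In this sketch the memory window must be long enough for the start cue to still be in view at the junction; with a shorter window the junction states for the two cues become indistinguishable, which is the kind of aliasing that memory-based approaches aim to resolve.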
Similar resources
Hierarchical Memory-Based Reinforcement Learning
A key challenge for reinforcement learning is scaling up to large partially observable domains. In this paper, we show how a hierarchy of behaviors can be used to create and select among variable length short-term memories appropriate for a task. At higher levels in the hierarchy, the agent abstracts over lower-level details and looks back over a variable number of high-level decisions in time....
Analysis of Memory-Based Learning Schemes for Robot Navigation in Discrete Grid-Worlds with Partial Observability
In this paper we tackle the problem of robot navigation in discrete grid-worlds using memory-based learning schemes. Different memory-based approaches are tested for navigating an agent across a discrete but partially observable world, and the significance of memory structure is examined. Further, the effects of additional memory hierarchies and multi-level learning frameworks are anal...
Autonomous Navigation in Partially Observable Environments Using Hierarchical Q-Learning
A self-learning adaptive flight control design allows reliable and effective operation of flight vehicles in a complex environment. Reinforcement Learning provides a model-free, adaptive, and effective process for optimal control and navigation. This paper presents a new and systematic approach combining Q-learning and hierarchical reinforcement learning with additional connecting Q-value funct...
An Integrated Framework for Robust Human-Robot Interaction
Developments in sensor technology and sensory input processing algorithms have enabled the use of mobile robots in real-world domains. As they are increasingly deployed to interact with humans in our homes and offices, robots need the ability to operate autonomously based on sensory cues and high-level feedback from non-expert human participants. Towards this objective, this chapter describes a...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
Publication date: 2005